Sub-Gaussian tails for the number of triangles in G(n,p)
Abstract

Let $X$ be the random variable that counts the number of triangles in the random graph $G(n,p)$. We show that for some absolute constant $c$, the probability that $X$ deviates from its expectation by at least $\lambda \mathrm{Var}(X)^{1/2}$ is at most $e^{-c\lambda^2}$, provided that $n^{-1}(\ln n)^{10} \le p \le n^{-1/2}(\ln n)^{-10}$, $\lambda = \omega(\ln n)$ and $\lambda \le \min\{(np)^{1/2}, n^{-3/4}p^{-3/2}, n^{1/6}\}$.

1 Introduction



In this paper we consider the standard Erdős-Rényi random graph $G(n,p)$, in which every edge of $K_n$ appears independently with probability $p$. We study the number of triangles, denoted by $X$, in $G(n,p)$. This is a classical topic in the theory of random graphs [?]. Our starting point is the following question regarding the distribution of $X$. This question has been explicitly raised and studied by Vu [13, 14] and more recently by Kannan [7].


Question 1.1. For which $p$ and $\lambda$ does $X$ have the sub-Gaussian tails

$$\Pr\big[\,|X - \mathbb{E}[X]| \ge \lambda \mathrm{Var}(X)^{1/2}\,\big] \le e^{-c\lambda^2}, \tag{1}$$

where $c$ is an absolute positive constant?
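To make the quantities in Question 1.1 concrete, here is a minimal Python sketch that samples $G(n,p)$, counts triangles, and standardizes the count by its exact mean and variance. The function names and the toy parameters are illustrative assumptions, not taken from the paper; the variance formula is the standard one obtained by summing covariances over pairs of triangles that share an edge.

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Return the edge set of one sample of G(n, p) on vertices 0..n-1."""
    return {e for e in itertools.combinations(range(n), 2) if rng.random() < p}


def count_triangles(n, edges):
    """Count vertex triples whose three connecting edges are all present."""
    return sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


def triangle_mean(n, p):
    """E[X] = C(n,3) p^3."""
    return math.comb(n, 3) * p**3


def triangle_variance(n, p):
    """Var(X): each triangle contributes p^3 - p^6, and each ordered pair of
    distinct triangles sharing one edge contributes covariance p^5 - p^6."""
    return math.comb(n, 3) * (p**3 - p**6) + 12 * math.comb(n, 4) * (p**5 - p**6)


def standardized_deviation(n, p, rng):
    """Return (X - E[X]) / Var(X)^{1/2} for one sample of G(n, p)."""
    x = count_triangles(n, sample_gnp(n, p, rng))
    return (x - triangle_mean(n, p)) / math.sqrt(triangle_variance(n, p))


if __name__ == "__main__":
    rng = random.Random(0)
    n, p, lam = 30, 0.2, 2.0  # toy values, far from the asymptotic regime
    devs = [standardized_deviation(n, p, rng) for _ in range(200)]
    frac = sum(abs(d) >= lam for d in devs) / len(devs)
    print(f"empirical Pr[|X - E[X]| >= {lam} * sd] is about {frac:.3f}")
```

Such small simulations only illustrate the definitions; they say nothing about the asymptotic range of $p$ and $\lambda$ that the question asks about.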

Ruciński [11] showed that if $1/2 \ge p = \omega(n^{-1})$ then $(X - \mathbb{E}[X])/\mathrm{Var}(X)^{1/2}$ tends in distribution to the normal distribution $N(0,1)$. This implies that (1) holds for the same range of $p$ for every constant $\lambda > 0$.

Vu [13] showed that there is a constant $c_1 > 0$ such that (1) holds for $p = \omega(n^{-1/2}\ln n)$ and $c_1 n p^2 \ge \lambda = \omega(\ln n)$. Vu [14] also showed that for every constant $c_2 > 0$ there is a constant $c_3 > 0$ such that (1) holds if $p \ge n^{-1/2+c_2}$ and $0 < \lambda \le n^{c_3}$. Recently, Kannan [7] showed that there are

constants 04,05 > such that if c^n^^ Inn < p < C4n~^^^, then for A = 0{np), Pr[|X — E[X] | > 
AVar(X)V2] < ^^^-cX f^^. 

some absolute constant $c$. In this paper we improve upon Kannan's result both by expanding the range of $p$ and by giving a better upper bound on the tail. More importantly, our result in a way complements Vu's results, in that it addresses the question above with regard to the case where $n^{-1}(\ln n)^{10} \le p \le n^{-1/2}(\ln n)^{-10}$. Formally, we prove the following.

Theorem 1.2. Inequality (1) is valid if $n^{-1}(\ln n)^{10} \le p \le n^{-1/2}(\ln n)^{-10}$, $\lambda = \omega(\ln n)$ and $\lambda \le \min\{(np)^{1/2}, n^{-3/4}p^{-3/2}, n^{1/6}\}$.

The proof of Theorem 1.2 employs an iterative invocation of McDiarmid's inequality (which is stated in the next subsection), and a certain iterative view of the random graph $G(n,p)$. The proof is given in the next section.

1.1 McDiarmid's inequality



Let $a_1, a_2, \ldots, a_m$ be independent random variables with $a_i$ taking values in a set $A_i$. Let $f : \prod_{i=1}^{m} A_i \to \mathbb{R}$ satisfy the following Lipschitz condition: if two vectors $a, a' \in \prod_{i=1}^{m} A_i$ differ only in the $i$th coordinate, then $|f(a) - f(a')| \le \alpha_i$. McDiarmid's inequality [10] states that the random variable $W = f(a_1, a_2, \ldots, a_m)$ satisfies for any $t > 0$,

$$\Pr\big[\,|W - \mathbb{E}[W]| \ge t\,\big] \le 2\exp\Big(-\frac{2t^2}{\sum_{i=1}^{m}\alpha_i^2}\Big).$$
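As a sanity check on the shape of this bound, the following Python sketch compares the standard two-sided McDiarmid bound $2\exp(-2t^2/\sum_i \alpha_i^2)$ with an exact tail. The test function is an assumption chosen for simplicity: $W$ is a sum of $m$ independent 0/1 variables, so every Lipschitz coefficient equals 1 and the exact tail is a binomial sum. The function names are hypothetical.

```python
import math


def mcdiarmid_bound(t, alphas):
    """Two-sided bound 2 * exp(-2 t^2 / sum(alpha_i^2))."""
    return 2.0 * math.exp(-2.0 * t * t / sum(a * a for a in alphas))


def exact_two_sided_tail(m, q, t):
    """Exact Pr[|W - E[W]| >= t] for W ~ Binomial(m, q), i.e. a sum of m
    independent 0/1 variables, each with Lipschitz coefficient 1."""
    mean = m * q
    return sum(
        math.comb(m, k) * q**k * (1 - q) ** (m - k)
        for k in range(m + 1)
        if abs(k - mean) >= t
    )


if __name__ == "__main__":
    m, q = 100, 0.5
    for t in (5, 10, 15, 20):
        exact = exact_two_sided_tail(m, q, t)
        bound = mcdiarmid_bound(t, [1.0] * m)
        print(f"t={t:2d}  exact tail={exact:.3e}  McDiarmid bound={bound:.3e}")
```

In the proof below the coefficients are far from uniform, which is why the bound is driven by the sum of their squares rather than by the largest coefficient.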



2 Proof of Theorem 1.2



Fix $p$ and $\lambda$ within the ranges asserted by the theorem. It is safe to assume, and we will use this implicitly in the proof, that $n \ge n_0$ for some sufficiently large constant $n_0$ which we do not explicitly state (otherwise the theorem is trivial). The proof of the theorem relies on the analysis of the following iterative process, which gives an alternative definition of $G(n,p)$:

Definition 1. Let $\varepsilon < 1/1000$ be a constant such that $\varepsilon^{l} = p$ for some integer $l \le \ln n$. Let $G_0 := K_n$. Given $G_i$, $i \ge 0$, construct $G_{i+1}$ by taking every edge in $G_i$ independently with probability $\varepsilon$. End upon obtaining $G_l$.
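The iterative process of Definition 1 can be sketched in a few lines of Python. This is only an illustration under toy assumptions: the function name is hypothetical, graphs are stored as sets of vertex pairs, and the demo uses a retention probability of 0.5 so that something survives on a small vertex set, whereas the definition requires a retention probability below 1/1000.

```python
import itertools
import random


def thinning_process(n, eps, l, rng):
    """Return [G_0, ..., G_l] as edge sets: G_0 = K_n, and G_{i+1} keeps each
    edge of G_i independently with probability eps."""
    graphs = [set(itertools.combinations(range(n), 2))]
    for _ in range(l):
        graphs.append({e for e in graphs[-1] if rng.random() < eps})
    return graphs


if __name__ == "__main__":
    rng = random.Random(1)
    n, eps, l = 60, 0.5, 3  # toy values only
    runs = 200
    avg = sum(len(thinning_process(n, eps, l, rng)[-1]) for _ in range(runs)) / runs
    expected = (n * (n - 1) // 2) * eps**l  # mean edge count of G(n, eps^l)
    print(f"average |E(G_l)| = {avg:.1f},  C(n,2) * eps^l = {expected:.1f}")
```

Each edge of $K_n$ survives all $l$ rounds independently with probability equal to the $l$-th power of the retention probability, which is why the last graph in the list is distributed as $G(n,p)$.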

It is clear that $G_i$ has the same distribution as $G(n,\varepsilon^{i})$. In particular, by the definition of $l$, $G_l$ has the same distribution as $G(n,p)$. Let $X_i$ be the random variable that counts the number of triangles in $G_i$ and note that $X = X_l$. Let $Y_{i,e}$ be the number of sets $\{e',e''\} \subseteq G_i$ such that $\{e,e',e''\}$ is a triangle. Let $Z_{i,e} := Y_{i,e} \cdot 1[e \in G_i]$, where $1[e \in G_i]$ is the indicator function for the event that $e \in G_i$. (In words, $Z_{i,e}$ is equal to $Y_{i,e}$ if $e \in G_i$ and is equal to 0 otherwise.) We will use $r \pm s$ below to denote the interval $[r - s, r + s]$. The following lemma, as we soon show, can be easily used to prove the theorem.
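The two edge statistics just defined are easy to compute directly. The Python sketch below (hypothetical helper names, graphs as sets of sorted vertex pairs) computes them for a single graph and checks the identity used later in the proof, namely that summing the indicator-weighted statistic over all pairs of vertices counts every triangle exactly three times.

```python
import itertools
import random


def edge(u, v):
    """Canonical (sorted) representation of the pair {u, v}."""
    return (u, v) if u < v else (v, u)


def y_value(n, edges, e):
    """Y_e: number of pairs {e', e''} of present edges that form a triangle
    together with e = {u, v}; e itself need not be present."""
    u, v = e
    return sum(
        1
        for w in range(n)
        if w != u and w != v and edge(u, w) in edges and edge(v, w) in edges
    )


def z_value(n, edges, e):
    """Z_e = Y_e * 1[e is present]."""
    return y_value(n, edges, e) if e in edges else 0


def count_triangles(n, edges):
    """Number of triangles in the graph."""
    return sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


if __name__ == "__main__":
    rng = random.Random(2)
    n = 12
    pairs = list(itertools.combinations(range(n), 2))
    edges = {e for e in pairs if rng.random() < 0.4}
    total_z = sum(z_value(n, edges, e) for e in pairs)
    print("sum of Z_e:", total_z, " 3 * triangles:", 3 * count_triangles(n, edges))
```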

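Lemma 2.1. There is a constant $c_0 > 0$ s.t. for all $0 \le i < l$ the following holds. Assume that

• $X_i \in \binom{n}{3}\varepsilon^{3i} \pm \Big(\binom{n}{3}\varepsilon^{3i}\, i\,(np)^{-3/2} + 0.1\lambda\sqrt{n^3\varepsilon^{3i}}\Big)$.

• $\forall e \in K_n.\ Y_{i,e} \le \max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\}$.

• $\sum_{e \in K_n} Z_{i,e}^2 \le n^4\varepsilon^{5i}(1 + i\,n^{-1/4}) + 10\binom{n}{3}\varepsilon^{3i}$.

Then each of the following items occurs with probability at least $1 - 2e^{-c_0\lambda^2}$.

• $X_{i+1} \in \binom{n}{3}\varepsilon^{3(i+1)} \pm \Big(\binom{n}{3}\varepsilon^{3(i+1)}(i+1)(np)^{-3/2} + 0.1\lambda\sqrt{n^3\varepsilon^{3(i+1)}}\Big)$.

• $\forall e \in K_n.\ Y_{i+1,e} \le \max\{4n\varepsilon^{2(i+1)} + \lambda\sqrt{4n\varepsilon^{2(i+1)}},\ \lambda^2\}$.

• $\sum_{e \in K_n} Z_{i+1,e}^2 \le n^4\varepsilon^{5(i+1)}(1 + (i+1)\,n^{-1/4}) + 10\binom{n}{3}\varepsilon^{3(i+1)}$.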

Proof of Theorem 1.2. The preconditions in Lemma 2.1 hold trivially for $i = 0$. Since $l \le \ln n$ and $\lambda = \omega(\ln n)$, we thus get from Lemma 2.1 that with probability at least $(1 - 6e^{-c_0\lambda^2})^{l} \ge 1 - e^{-0.5c_0\lambda^2}$,

$$X = X_l \in \binom{n}{3}\varepsilon^{3l} \pm \Big(\binom{n}{3}\varepsilon^{3l}\, l\,(np)^{-3/2} + 0.1\lambda\sqrt{n^3\varepsilon^{3l}}\Big) \subseteq \mathbb{E}[X] \pm 0.2\lambda(np)^{3/2},$$

where the last containment follows since $\mathbb{E}[X] = \binom{n}{3}p^3$ and $p = \varepsilon^{l}$, and from the upper and lower bounds on $l$ and $\lambda$ respectively. This implies the validity of Theorem 1.2, as one can easily verify that $\mathrm{Var}(X) \ge 0.1(np)^3$ for our choice of $p$. ■
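The arithmetic behind the final containment in an interval of half-width $0.2\lambda(np)^{3/2}$ around $\mathbb{E}[X]$, and behind the passage to the statement of the theorem, can be spelled out as follows. This is a sketch under the stated hypotheses only: $l \le \ln n$, the $l$-th power of the per-round retention probability $\varepsilon$ equals $p$, $\lambda = \omega(\ln n)$, $n$ is sufficiently large, and $\mathrm{Var}(X) \ge 0.1(np)^3$.

```latex
\begin{align*}
\binom{n}{3}\varepsilon^{3l}\, l\,(np)^{-3/2}
  &\le \frac{(np)^{3}}{6}\,\ln n\,(np)^{-3/2}
   = \frac{\ln n}{6}\,(np)^{3/2}
   \le 0.1\,\lambda\,(np)^{3/2}, \\
0.1\,\lambda\sqrt{n^{3}\varepsilon^{3l}}
  &= 0.1\,\lambda\,(np)^{3/2}, \\
0.2\,\lambda\,(np)^{3/2}
  &\le 0.2\sqrt{10}\,\lambda\,\mathrm{Var}(X)^{1/2}
   < \lambda\,\mathrm{Var}(X)^{1/2}.
\end{align*}
```

The first line uses the upper bound on $l$ and the lower bound on $\lambda$, the second is an identity, and the third uses the lower bound on the variance, so a deviation of at least $\lambda\mathrm{Var}(X)^{1/2}$ forces $X$ outside the interval above.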

It remains to prove Lemma 2.1. Fix a constant $c_0 > 0$, sufficiently small so that it satisfies our claims below. Fix $0 \le i < l$ and assume that we are given $G_i$ and that the preconditions in the lemma hold for $i$. We show that each of the three consequences in the lemma holds with probability at least $1 - 2e^{-c_0\lambda^2}$.

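First consequence. Clearly,

$$\mathbb{E}[X_{i+1}] = \varepsilon^3 X_i \in \binom{n}{3}\varepsilon^{3(i+1)} \pm \Big(\binom{n}{3}\varepsilon^{3(i+1)}\, i\,(np)^{-3/2} + 0.1\varepsilon^{1.5}\lambda\sqrt{n^3\varepsilon^{3(i+1)}}\Big).$$

We need to bound from above the probability that $X_{i+1}$ deviates from its expectation by more than $t_1 := \binom{n}{3}\varepsilon^{3(i+1)}(np)^{-3/2} + 0.1(1 - \varepsilon^{1.5})\lambda\sqrt{n^3\varepsilon^{3(i+1)}}$. Every edge $e \in G_i$ has an outcome which is either the event that $e \in G_{i+1}$ or not. Note that $X_{i+1}$ depends on the outcomes of the edges in $G_i$. Also note that changing the outcome of a single edge $e \in G_i$ can change $X_{i+1}$ by at most $Z_{i,e}$ and that $\sum_{e \in G_i} Z_{i,e}^2 = \sum_{e \in K_n} Z_{i,e}^2$. Therefore, by McDiarmid's inequality and by the assumed upper bound on $\sum_{e \in K_n} Z_{i,e}^2$,

$$\Pr\big[\,|X_{i+1} - \mathbb{E}[X_{i+1}]| \ge t_1\,\big] \le 2\exp\Big(-\frac{2t_1^2}{\sum_{e \in G_i} Z_{i,e}^2}\Big) \le 2\exp\Big(-\frac{2t_1^2}{2n^4\varepsilon^{5i} + 2n^3\varepsilon^{3i}}\Big). \tag{2}$$

We have two cases. The first case is that $\varepsilon^{i} \ge n^{-1/2}$. In that case $2n^4\varepsilon^{5i} + 2n^3\varepsilon^{3i} \le 4n^4\varepsilon^{5i}$. Using this and the fact that $t_1 \ge 0.1n^3\varepsilon^{3(i+1)}(np)^{-3/2}$, we get from (2) that

$$\Pr\big[\,|X_{i+1} - \mathbb{E}[X_{i+1}]| \ge t_1\,\big] \le 2\exp\Big(-\frac{0.02\,n^6\varepsilon^{6(i+1)}(np)^{-3}}{4n^4\varepsilon^{5i}}\Big) \le 2e^{-c_0\lambda^2},$$

where the last inequality follows from the fact that $\varepsilon^{i} \ge n^{-1/2}$ and $\lambda \le n^{-3/4}p^{-3/2}$.

Next consider the case that $\varepsilon^{i} < n^{-1/2}$. In that case, $2n^4\varepsilon^{5i} + 2n^3\varepsilon^{3i} \le 4n^3\varepsilon^{3i}$. Since $t_1 \ge 0.05\lambda\sqrt{n^3\varepsilon^{3(i+1)}}$, we get from (2) that

$$\Pr\big[\,|X_{i+1} - \mathbb{E}[X_{i+1}]| \ge t_1\,\big] \le 2\exp\Big(-\frac{0.005\,\lambda^2 n^3\varepsilon^{3(i+1)}}{4n^3\varepsilon^{3i}}\Big) \le 2e^{-c_0\lambda^2}.$$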

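Second consequence. Fix $e \in K_n$. If $Y_{i,e} \le \lambda^2$ then clearly $Y_{i+1,e} \le Y_{i,e} \le \lambda^2$ with probability 1. Otherwise we have $Y_{i,e} > \lambda^2$. Hence by assumption, $Y_{i,e} \le 4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}}$. This implies that $\mathbb{E}[Y_{i+1,e}] \le 4n\varepsilon^{2(i+1)} + \varepsilon\lambda\sqrt{4n\varepsilon^{2(i+1)}}$. Note that $4n\varepsilon^{2i} \ge 0.25\lambda^2$, since otherwise $Y_{i,e} < \lambda^2$. Hence $\mathbb{E}[Y_{i+1,e}] \le 4n\varepsilon^{2(i+1)} + \varepsilon\lambda\sqrt{4n\varepsilon^{2(i+1)}} \le 12n\varepsilon^{2(i+1)}$. This in turn implies, again using $4n\varepsilon^{2i} \ge 0.25\lambda^2$, that $\mathbb{E}[Y_{i+1,e}] + O(\lambda\sqrt{4n\varepsilon^{2(i+1)}}) = O(n\varepsilon^{2(i+1)})$. Therefore, by Chernoff's bound, the probability that $Y_{i+1,e}$ deviates from its expectation by more than $(1-\varepsilon)\lambda\sqrt{4n\varepsilon^{2(i+1)}}$ is at most $e^{-2c_0\lambda^2}$. Thus, for a fixed $e \in K_n$, we have that $Y_{i+1,e} \le \max\{4n\varepsilon^{2(i+1)} + \lambda\sqrt{4n\varepsilon^{2(i+1)}},\ \lambda^2\}$ with probability at least $1 - e^{-2c_0\lambda^2}$. It now follows from the union bound and the fact that $\lambda = \omega(\ln n)$ that with probability at least $1 - n^2e^{-2c_0\lambda^2} \ge 1 - 2e^{-c_0\lambda^2}$, $Y_{i+1,e} \le \max\{4n\varepsilon^{2(i+1)} + \lambda\sqrt{4n\varepsilon^{2(i+1)}},\ \lambda^2\}$ holds for all $e \in K_n$ simultaneously, as needed.

Third consequence. We start by estimating $\mathbb{E}[Z_{i+1,e}^2]$ from above for a fixed $e \in K_n$. Clearly, if $e \notin G_i$ then $\mathbb{E}[Z_{i+1,e}^2] = Z_{i,e} = 0$. So assume $e \in G_i$. If $e \notin G_{i+1}$ then trivially $\mathbb{E}[Z_{i+1,e}^2] = 0$. Conditioning on the event that $e \in G_{i+1}$, $Z_{i+1,e}$ is a binomial random variable with mean $\varepsilon^2 Z_{i,e}$ and variance $\varepsilon^2(1 - \varepsilon^2)Z_{i,e}$. Therefore, conditioning on $e \in G_{i+1}$, we have that $\mathbb{E}[Z_{i+1,e}^2] = \mathbb{E}[Z_{i+1,e}]^2 + \mathrm{Var}(Z_{i+1,e}) \le \varepsilon^4 Z_{i,e}^2 + \varepsilon^2 Z_{i,e}$. Adding the fact that the event $e \in G_{i+1}$ occurs with probability $\varepsilon$ we can conclude that without the conditioning on $e \in G_{i+1}$, $\mathbb{E}[Z_{i+1,e}^2] \le \varepsilon^5 Z_{i,e}^2 + \varepsilon^3 Z_{i,e}$.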
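The second-moment estimate for a single edge can be checked numerically. The sketch below is an illustration with hypothetical function names: it takes an edge that is present in the current graph and lies in `z` triangles there, computes the exact second moment of its triangle count after one thinning round with retention probability `eps`, and compares it with the bound `eps**5 * z**2 + eps**3 * z`.

```python
import math


def second_moment_exact(z, eps):
    """Exact second moment of the next-round count for a present edge lying
    in z triangles: the edge survives with probability eps, and then each of
    its z triangles survives independently with probability eps^2."""
    q = eps * eps
    conditional = sum(
        k * k * math.comb(z, k) * q**k * (1 - q) ** (z - k) for k in range(z + 1)
    )
    return eps * conditional


def second_moment_bound(z, eps):
    """The upper bound eps^5 z^2 + eps^3 z."""
    return eps**5 * z * z + eps**3 * z


if __name__ == "__main__":
    for z in (1, 5, 50):
        for eps in (0.001, 0.1, 0.5):
            exact = second_moment_exact(z, eps)
            bound = second_moment_bound(z, eps)
            print(f"z={z:2d} eps={eps:5.3f}  exact={exact:.3e}  bound={bound:.3e}")
```

The gap between the two values is exactly `eps**5 * z`, the term dropped when the binomial variance is bounded by its mean.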

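Let $Z := \sum_{e \in K_n} Z_{i+1,e}^2$. By linearity of expectation and the previous paragraph,

$$\mathbb{E}[Z] \le \varepsilon^5\sum_{e \in K_n} Z_{i,e}^2 + \varepsilon^3\sum_{e \in K_n} Z_{i,e}.$$

Every triangle in $G_i$ is counted exactly 3 times in the sum $\sum_{e \in K_n} Z_{i,e}$, so $\sum_{e \in K_n} Z_{i,e} = 3X_i$. Also, $X_i \le 2\binom{n}{3}\varepsilon^{3i}$, and this follows from the assumed estimate on $X_i$ and the bounds on $p$, $\lambda$ and $l$. This implies that $\sum_{e \in K_n} \varepsilon^3 Z_{i,e} = 3\varepsilon^3 X_i \le 6\binom{n}{3}\varepsilon^{3(i+1)}$. Using this, the above upper bound on $\mathbb{E}[Z]$ and the assumed upper bound on $\sum_{e \in K_n} Z_{i,e}^2$ we get that

$$\mathbb{E}[Z] \le n^4\varepsilon^{5(i+1)}(1 + i\,n^{-1/4}) + 9\binom{n}{3}\varepsilon^{3(i+1)}.$$

It remains to estimate from above the probability that $Z$ deviates from its expectation by more than $t_2 := n^4\varepsilon^{5(i+1)}n^{-1/4} + \binom{n}{3}\varepsilon^{3(i+1)}$. Clearly $Z$ depends on the outcome of the edges in $G_i$. Fix $e \in G_i$ and let $\sum_{\{e',e''\}}$ be the sum over all $Z_{i,e}$ sets $\{e',e''\}$ such that $\{e,e',e''\}$ is a triangle in $G_i$. We claim that changing the outcome of $e$ can change $Z$ by at most

$$Z_{i,e}^2 + \sum_{\{e',e''\}}(2Z_{i,e'} + 1) + \sum_{\{e',e''\}}(2Z_{i,e''} + 1) \le Z_{i,e}Y_{i,e} + 6Z_{i,e}\cdot\max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\} \le 7Z_{i,e}\cdot\max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\}.$$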

Indeed, if $e \notin G_{i+1}$ then $Z_{i+1,e} = 0$ and otherwise $Z_{i+1,e} \le Z_{i,e}$. Hence, changing the outcome of $e$ can change $Z_{i+1,e}^2$ by at most $Z_{i,e}^2$. In addition, for every triangle $\{e,e',e''\}$ in $G_i$, changing the outcome of $e$ can change $Z_{i+1,e'}$ and $Z_{i+1,e''}$ each by at most 1. Since $Z_{i+1,e'} \le Z_{i,e'}$, this implies that changing the outcome of $e$ can change $Z_{i+1,e'}^2$ by at most $(Z_{i,e'} + 1)^2 - Z_{i,e'}^2 \le 2Z_{i,e'} + 1$. The same argument also shows that changing the outcome of $e$ can change $Z_{i+1,e''}^2$ by at most $2Z_{i,e''} + 1$. Lastly note that changing the outcome of $e$ can affect only the sum $Z_{i+1,e}^2 + \sum_{\{e',e''\}}\big(Z_{i+1,e'}^2 + Z_{i+1,e''}^2\big)$.

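If $4n\varepsilon^{2i} < \lambda^2$ then $\max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\} \le 2\lambda^2$. If on the other hand $4n\varepsilon^{2i} \ge \lambda^2$ then $\max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\} \le 8n\varepsilon^{2i}$. Hence $\max\{4n\varepsilon^{2i} + \lambda\sqrt{4n\varepsilon^{2i}},\ \lambda^2\}$ is at most $\max\{8n\varepsilon^{2i}, 2\lambda^2\}$. Also note that $\sum_{e \in G_i} Z_{i,e}^2 = \sum_{e \in K_n} Z_{i,e}^2$. Therefore, given the discussion above, it follows from McDiarmid's inequality and the assumed upper bound on $\sum_{e \in K_n} Z_{i,e}^2$ that

$$\Pr\big[\,|Z - \mathbb{E}[Z]| \ge t_2\,\big] \le 2\exp\Big(-\frac{2t_2^2}{\sum_{e \in G_i}\big(7Z_{i,e}\cdot\max\{8n\varepsilon^{2i}, 2\lambda^2\}\big)^2}\Big) \le 2\exp\Big(-\frac{t_2^2}{100\max\{64n^2\varepsilon^{4i}, 4\lambda^4\}\cdot(n^4\varepsilon^{5i} + n^3\varepsilon^{3i})}\Big). \tag{3}$$

Assume that $\varepsilon^{i} \ge n^{-1/2}$. In that case, $n^4\varepsilon^{5i} + n^3\varepsilon^{3i} \le 2n^4\varepsilon^{5i}$. In addition, trivially $t_2 \ge n^4\varepsilon^{5(i+1)}n^{-1/4}$. Therefore, from (3) it follows that

$$\Pr\big[\,|Z - \mathbb{E}[Z]| \ge t_2\,\big] \le 2\exp\Big(-\frac{n^8\varepsilon^{10(i+1)}n^{-1/2}}{100\max\{64n^2\varepsilon^{4i}, 4\lambda^4\}\cdot 2n^4\varepsilon^{5i}}\Big) \le 2\exp\Big(-\frac{\varepsilon^{10}}{12800}\cdot\frac{n^{3.5}\varepsilon^{5i}}{\max\{n^2\varepsilon^{4i}, \lambda^4\}}\Big) \le 2e^{-c_0\lambda^2},$$

where the last inequality follows since $\varepsilon^{i} \ge n^{-1/2}$ and $\lambda \le n^{1/6}$.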

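Next assume that $\varepsilon^{i} < n^{-1/2}$. In that case, $\max\{64n^2\varepsilon^{4i}, 4\lambda^4\} = 4\lambda^4$ and $n^4\varepsilon^{5i} + n^3\varepsilon^{3i} \le 2n^3\varepsilon^{3i}$. In addition, trivially, $t_2 \ge 0.1n^3\varepsilon^{3(i+1)}$. Therefore, from (3),

$$\Pr\big[\,|Z - \mathbb{E}[Z]| \ge t_2\,\big] \le 2\exp\Big(-\frac{0.01\,n^6\varepsilon^{6(i+1)}}{400\lambda^4\cdot 2n^3\varepsilon^{3i}}\Big) \le 2\exp\Big(-\frac{\varepsilon^{6}}{80000}\cdot\frac{n^3\varepsilon^{3i}}{\lambda^4}\Big) \le 2e^{-c_0\lambda^2},$$

where the last inequality follows since $\lambda \le (np)^{1/2} \le (n\varepsilon^{i})^{1/2}$.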



3 Concluding remarks 

Combining Theorem 1.2 and a result of Vu [13], we have that for every $p \ge n^{-1}(\ln n)^{10}$, if $p$ does not satisfy $n^{-1/2}(\ln n)^{-10} \le p \le O(n^{-1/2}\ln n)$, then one can take some $\lambda = \lambda(n)$ that goes to $\infty$ with $n$ so that the probability that $X$ deviates from its expected value by at least $\lambda\mathrm{Var}(X)^{1/2}$ is at most $e^{-c\lambda^2}$, for some absolute constant $c$. One question that remains open is what happens when $p = n^{-1/2}$. That is, can one show that (1) is valid for $p = n^{-1/2}$ and for some $\lambda = \lambda(n)$ that goes to $\infty$ with $n$?

In the proof of Theorem 1.2 we had to assume that $\lambda = \omega(\ln n)$. In fact we probably could have proved Theorem 1.2 had we assumed that $\lambda \ge C\ln n$ for some sufficiently large constant $C$. However, our argument would have failed if we took $\lambda = o(\ln n)$. This is because if $\lambda = o(\ln n)$, then Lemma 2.1 only implies that with probability at least $1 - e^{-0.5c_0\lambda^2}$, $X \in \mathbb{E}[X] \pm \omega(\lambda)(np)^{3/2}$, and this does not imply the theorem. This naturally raises the following question: is it true that (1) holds for $n^{-1}(\ln n)^{10} \le p \le n^{-1/2}(\ln n)^{-10}$ and, say, $\lambda = \sqrt{\ln n}$?

Finally, we note that our argument for the proof of Theorem 1.2 can be generalized so as to prove a rather general concentration result for functions with large Lipschitz coefficients. This is the subject of a forthcoming paper. This concentration result can provide some new sub-Gaussian tail bounds for the number of copies of $H$ in $G(n,p)$, for a large family of graphs $H$.